home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Risc World 3
/
Risc World 3.iso
/
SOFTWARE
/
ISSUE4
/
ZAP
/
!Zap
/
Code
/
Extensions
/
ExtAsm
/
ExtBasDoc
< prev
Wrap
Text File
|
2002-05-08
|
22KB
|
673 lines
Extended BASIC Assembler
————————————————————————
Version 2.01, 12 June 2000 (BASIC 1.05-1.16)
2.01 patch 2, 14 August 2000 (BASIC 1.20)
by Darren Salt <ds@youmustbejoking.demon.co.uk>
AOF code by Paul Clifford <paul@plasma.demon.co.uk>
Contains code from v1.00 and v1.30 by Adrian Lees
(This is a beta version.)
Licence
———————
ExtBASICasm is freeware. No restrictions are placed on its use; what you do
with it is your own responsibility.
Release versions of ExtBASICasm may be freely distributed in unchanged form;
patched versions may not be distributed unless the patches are *clearly*
documented; such documentation must be distributed with the patched module.
Beta versions of ExtBASICasm must not be redistributed.
Intro
—————
It is recommended that you read through this document before first using this
version of ExtBASICasm, and check through it quickly when upgrading. It's
entirely possible that one of the extras the module provides may cause a
clash with existing programs, for example the APCS-R register names clashing
with variables used as register names.
Note also that APCS-R register names are disabled by default.
The module ExtBASICasm provides a patch for various versions of BBC BASIC V
to allow the direct use of the extra instructions provided by the ARM3, ARM6,
ARM7 and StrongARM processors. The missing floating-point and general
coprocessor instructions, and some assembler directives more familiar (and a
few unfamiliar) to Acorn Assembler users have been added; also the APCS-R
register names may be used. Also, AOF files may be generated.
To make the necessary changes to the BASIC module it must be located in RAM.
The ExtBASICasm module will therefore attempt to RMFaster the BASIC module
which will require a small amount of memory in the RMA, in addition to that
required by the ExtBASICasm module itself. Attempting to run it while BASIC
is active and in ROM will not work - try "*RMFaster BASIC" at the BASIC
prompt and you'll see why.
Ver OS Supported?
1.05 3.1x Yes
1.06 3.5x Yes
1.14 3.6x Yes
1.16 3.7x Yes
1.17 3.8x Yes
1.18 4.00 No (1)
1.19 4.01 No (1)
1.20 4.0x Yes
(1) Support for these versions will not be added since v1.20 is available for
softload on RISC OS 4.
Enabling ExtBASICasm
————————————————————
Unlike some earlier versions, this version is initialised into a dormant
state whenever you start up the BASIC interpreter, eg. by double-clicking on
a BASIC program or by typing BASIC at the * prompt.
You can enable or disable the extensions by using the assembler pseudo-op
EXT n
where n is 0 to disable and 1 to enable. (Other values are currently mapped
to 1; do not rely on this.)
Setting any of the extension OPT bits *NO* *LONGER* enables ExtBASICasm.
Certain extensions remain enabled at all times: specifically, ALIGN always
zero-fills, and the ".foo = bar" bug remains fixed. I don't think that
this'll inconvenience anybody :-)
ExtBASICasm uses the BASIC data word TIMEOF, which is documented as "unused"
for all versions of BASIC V which it recognises, for its 'enabled' flag. It
also uses the byte at &86E4 for its extended options byte.
The instructions added by the module are as follows:
Extensions
——————————
Optional parts are enclosed in []
OPT [<value>]
Bit 4: ASSERT control (1 = enabled on 'second pass')
Bit 5: APCS register names (1 = enabled)
Bit 6: UMUL/UMULL control (0 = short forms, 1 = long forms)
Bit 7: AOF control. If set, then the AOF extension is enabled.
When ExtBASICasm is disabled, these bits take on their standard
meanings: bit 4 allows use of, amongst others, the FP instructions.
If value is omitted, the previous setting is used.
EXT <flag>
<flag>,<value>
<flag>,
Initialises or disables the extensions according to flag. The second
form allows you to simultaneously set the OPT value; the third (a
side effect of the use of the OPT code) causes the previous OPT value
to be used.
ALIGN
Zero-initialises the memory if required.
ALIGN <const>[,<const2>]
Aligns to a multiple of const bytes plus an optional offset. const
must be a power of 2 between 1 and 65536; const2 must be between 0
and const-1 (default is 0). Also zero-initialises the memory.
P% becomes (P%+const-1 AND const-1)+const2; O% is also updated if
necessary. Examples:
ALIGN 4
ALIGN 32
ALIGN 16,8
MUL{cond}{S} Rd,Rm,#<const>
variable length; Rd=Rm if <2 ADD/RSB
May cause 'duplicate register' if Rd=Rm and const is not simple - ie.
not 0, (2^x)-1, 2^x, (2^x)+(2^y)
MLA{cond}{S} Rd,Rm,#<const>,Ra
variable length; Rd=Rm if <2 ADD/RSB
Rd=Ra causes 'duplicate register' error if const is not simple, as
for MUL; Rd=Rm=Ra is special in that MLA Rd,Rd,#c,Rd = MUL Rd,Rd,#c+1
If Rd=Ra and const=0, no code is generated (none necessary).
DIV Rq,Rr,Rn,Rd,Rt [SGN Rs]
Integer division by register
Rq = quotient Rn = numerator Rt = temporary store
Rr = remainder Rd = denominator Rs = sign store
If Rs omitted then division is unsigned.
Rr may be same register as Rn *or* Rn may be same as Rs.
All other registers must be different.
Rt and Rs (if specified) are corrupted.
DIV Rq,Rr,Rn,#d[,[Rt]] [SGN Rs]
Integer division by constant
Registers as above
If Rs omitted then division is unsigned.
If Rt omitted and is required for this division then error given.
All registers must be different.
If specified, Rt and Rs are corrupted.
(Uses generator to build code - fast but may be long)
Notes: Uses Fourier method. For unsigned values, this is fixed to
handle unsigned top-bit-set properly, *except* for div by 3
which works for values up to &C0000000. Ideas and code
gratefully received...
*** Note no conditional for DIV
SQR Rt,Rr,Rx,Ry
SQR Rt,Rr,{Rx,Ry}
Square root
Input Rr = value, Rx & Ry = work registers
Output Rt = square root, Rr = remainder, Rx = 0, Ry corrupt
In effect, Rt = INT SQR Rr and Rr' = Rr-Rt*Rt
If {} are used, then Rx and Ry are preserved via STMFD R13!,{Rx,Ry}
and restored via LDMFD R13!,{Rx,Ry}.
*** Note no conditional for SQR
ADR{cond}L Rd,<const>
Fixed length (two words)
ADR{cond}X Rd,<const>
Fixed length (three words)
ADR{cond}W Rd,<const>
Addressing relative to R12, one to three words
<const> MUST be defined before it is used
Adds/subtracts const to/from R12, storing result in Rd
Up to you to ensure that R12 correctly set up...
LDR, STR
xxx{cond}{B}W Rd,<offset>
Load/store word/byte at [R12,#<offset>]
LDR{cond}{B}L Rd,<address>
LDR{cond}{B}L Rd,[Rm,#<offset>]{!}
LDR{cond}{B}WL Rd,<offset>
STR equivalents
Addressing range is ±1MB; some offsets outside this range are also
valid. Lengths are (in words):
LDR 2 ADD/SUB Rd,Rm,#a:LDR Rd,[Rd,#b]
LDR ...]! 2 ADD/SUB Rm,Rm,#a:LDR Rd,[Rm,#b]!
STR 3 ADD/SUB Rm,Rm,#a:STR Rd,[Rm,#b]:SUB/ADD Rm,Rm,#a
STR ...]! 2 ADD/SUB Rm,Rm,#a:STR Rd,[Rm,#b]!
LDR{cond}{B}L Rd,{Rn},<address>
LDR{cond}{B}L Rd,{Rn},[Rm,#<offset>]
LDR{cond}{B}WL Rd,{Rn},<offset>
STR equivalents
[{Rn} is NOT optional]
Equivalent to the LDR/STRs above, except that Rn (rather than Rd)
is used to hold the address; always two words long. For example,
ADRL R0,wibble:LDR R1,[R0] may be replaced with LDRL R1,{R0},wibble
- one word shorter.
Rd=Rn is not allowed.
Assembles to ADD/SUB Rn,Rm,#a : LDR/STR Rd,[Rn,#b]!
LDR{cond}{B}L Rd,[Rm],#<offset>
STR equivalent
Addressing range is ±1MB; some offsets outside this range are also
valid. Two words long.
Assembles to LDR/STR Rd,[Rm],#b:ADD/SUB Rm,Rm,#a
NOTE: You should try to avoid using *sequences* of LDRLs or STRLs -
there is usually a more efficient way.
LDRxxH, LDRxxSH, LDRxxSB and STRxxH
The W forms are supported.
Long LDR{H|SH|SB} not yet implemented.
SWAP{cond}[S] Rd,Rn
Swaps Rd and Rn without using temporary store.
Uses EOR method, is therefore three words long.
If S is specified, then the flags are set according to Rn.
VDU{cond}{X} <const>
= SWI "OS_WriteI"+<const>
With X present, XOS_WriteI is used instead.
NOP{cond}
= MOV{cond} R0,R0
(BASIC supports only an unconditional NOP.)
BRK{cond} [#<const>]
Undefined instruction. If <const> is specified, then R14 is set to
this value before the undefined instruction trap is taken.
EQUx, DCx, =
xxx <value>[,<value>]^
Extended form of EQUD, EQUW, DCB, etc.
Instead of, eg. DCD 0 : DCD 12 : DCD branch
you can now use DCD 0, 12, branch
Negative constants
Allowed in the following instructions:
ADD, SUB ADC, SBC ADF, SUF
AND, BIC MOV, MVN MVF, MNF
CMP, CMN CMF, CNF CMFE, CNFE
If the constant is invalid for one of these, it is negated or
inverted, as appropriate, and the instruction changed to the other of
the pair (eg. ADC becomes SBC). If the constant is still invalid, the
"bad immediate constant" error is generated as normal.
ARMv3 (ARM6) and later
——————————————————————
MSR{cond} <psr>_<f>,Rm
(Standard extension.)
<f> may, in addition to the standard combinations of 'c' 'x' 's' 'f',
be one of:
ctl control bits only
flg flag bits only
all both
Any combination of 'cf' is equivalent to 'all'; you may also, in the
standard form, use '_' between letters.
ARMv4 (ARM8, StrongARM) and later
—————————————————————————————————
UMUL, SMUL, UMLA, SMLA:
xxx{cond}{S} Rl,Rh,Rm,Rn
The 'official' forms UMULL, SMULL, UMLAL, SMLAL are used *instead of*
the 'short' forms if extended OPT bit 6 is set.
Unfortunately it's not possible to allow both forms at once: how
would you interpret "UMULLS" - UMUL condition LS or UMULL with S bit?
Floating-point instructions
———————————————————————————
Floating point coprocessor data transfer
LDF, STF:
xxx{cond}precW Fd,<offset>
LFM{cond}{stack} Fd,m,[Rn]{!}
SFM{cond}{stack} Fd,m,[Rn]{!}
LFS{cond}{stack} Rn{!},<fp register list>
LFS{cond}{stack} Rn{!},<fp register list>
LFM, SFM, LFS and SFS use extended precision. The <fp register list>
is much as for LDM and STM, with restrictions: you must specify a
register or a sequence of registers, and the list must be compatible
with LFM and SFM - eg.
LFSFD R13!,{F3} LFMFD F3,1,[R13]! LFM F3,1,[R13],#12
SFSFD R13!,{F5-F0} SFMFD F5,4,[R13]! SFM F5,4,[R13,#-36]!
LFSDB R13,{F1,F0} LFMDB F0,2,[R13] LFM F0,2,[R13,#-24]
- for each row, all the instructions have the same effect.
Available stack types are DB, IA, EA, FD.
Note that example 2 wraps around - F5, F6, F7, F0 _in that order_.
Assembler directives
————————————————————
* Conditional - will STOP if expression is FALSE:
ASSERT <expression>
Bit 4 of the extended OPT value controls ASSERT. When it and bit 1
are zero, ASSERTs are ignored.
* Constants
= <const|string>
The bug causing an error when used in the form
.label = "something"
has been fixed.
EQUF, DCF
xxx <const>
Synonyms for EQUFD.
EQUP, DCP, P
xxx <string>,<const>
xxx <const>,<string>
Fixed-length string allocation. If the string is too short, then the
remaining space is padded with nulls; if it is too long, it is
truncated to the specified length.
EQUPW, DCPW, PW
xxx <pad_byte>,<string>,<const>
xxx <pad_byte>,<const>,<string>
Like EQUP, except that you specify the padding byte.
EQUZ, DCZ, Z
xxx <string>
EQUS with automatic zero termination
EQUZA, DCZA, ZA
xxx <string>
Equivalent to EQUZ followed by ALIGN
Note: *ALL* the EQU... directives (and their equivalents) may have
their arguments repeated as described in the Extensions section.
FILL, %
xxx{B|W|D} <const>{,{<value>}}
Allocates <const> bytes of memory, initialised to <value> (or 0).
B, W and D represent data lengths as for EQU; if omitted, then byte
length is assumed. If the comma is present but no fill value, this is
equivalent to adding the constant to P% (and O% if appropriate).
FILE <filename>
Loads the specified file, allocating just enough space for it.
^ <offset>
Initialises the workspace address pointer to the given value.
This is used and updated by #.
Typical use:
^ 0
...
# flags, 4
...
LDRW R0,flags
# <variable>, <length>
Sets the variable to the current value of the workspace address
pointer, which is then incremented by <length>.
This does not alter P% or O%.
(Note: the variable is assigned before the length is evaluated.)
COND <cond>
Sets the condition code for use with = (when used as a condition
code). It may be supplied as a condition code literal, a number (0 to
15), or a string containing a condition code literal. For example,
all of the following are equivalent:
COND 7 ; number
COND VC ; condition code literal
COND vc ; condition code literal
COND "Vc" ; string containing cond. code lit.
Example code:
COND LT ; select LT condition code
MOV= R0,#2 ; MOVLT R0,#2
MOV=S R1,R2 ; MOVLTS R1,R2
AOF generation
——————————————
AREA "name", "attributes"
An area is a named block of code or data that can be manipulated by a
linker, such as drlink, to form the final program image. Typically, a
program will be divided into two main areas; one for code and the
other for data. For example, Acorn's C compiler uses areas named
"C$$code" and "C$$data".
Each area has a set of attributes which provide extra information to
the linker. The attributes recognised by ExtBASICasm are listed below
(the case is ignored):
32BIT The code complies to the 32-bit variant of the APCS.
CODE Contains machine code instructions.
DATA Contains data, not instructions.
EXTFP The extended floating-point instruction set is used (LFM
and SFM instead of multiple LDFEs and STFEs).
NOCHECK The code complies with a variant of the APCS without
software stack-limit checking.
PIC Position Independent Code; will execute where loaded
without modification.
READONLY This area will not be written to.
REENTRANT The code complies with the reentrant APCS standard.
Each attribute should be separated from the next by a comma (spaces
are optional and ignored when processing the list). It is important
to make sure that all areas of the same name also have the same
attributes, otherwise the linking process will fail.
Example area definitions:
AREA "C$$code", "CODE, READONLY"
AREA "MyProgram$Data", "DATA"
IMPORT "symbol" ["attributes"]
IMPORT "symbol", "alternative name" ["attributes"]
References between areas are handled via "symbols". Each area exports
a list of symbols that are available for external use, eg "strlen",
"printf", and "fread" from the C library stubs. An area can then
import the symbol and use it as if it were defined locally, leaving
the linker to resolve the references later.
IMPORT makes an external symbol available for use within the current
area. It creates a variable of the same name, ending with an @, which
can be used with STR, LDR, EQUD, EQUW and B instructions. An
alternative name can be supplied, making it possible to import
symbols containing characters that are illegal in a BASIC variable
name, such as $. Example uses:
IMPORT "strcpy" ; import strcpy as strcpy@
IMPORT "an$example", "example" ; import an$example as example@
BL strcpy@ ; call the strcpy routine
LDR R0, example@ ; load a word from an$example
Note: The '@' variables should always be treated as read-only.
The optional attributes are as follows:
FPREG This is only meaningful if the imported symbol is a
function entry point and indicates that floating point
arguments will be passed in floating point registers.
INSENSITIVE The linker will ignore the case when trying to resolve
the reference.
NOCASE A synonym for INSENSITIVE.
WEAK It is acceptable for the reference to remain unsatisfied.
Example:
IMPORT "xyz", "abc" ["nocase"] ; import xyz as abc@, ignoring case
EXPORT "symbol" ["attributes"]
EXPORT "symbol", "alternative name" ["attributes"]
EXPORT makes an address within the current area available for outside
use. The "symbol" is the BASIC variable containing the address and,
if the alternative name is missing, the name under which it is
exported. If the variable is an integer you would probably want to
supply an alternative name to remove the % at the end. Example uses:
EXPORT "compare%", "uint_cmp" ; export compare% as uint_cmp
EXPORT "value", "an$integer" ; export value as an$integer
EXPORT "value" ; also export value as value
.compare%
CMP R0, R1
MOVLO R0, #-1
MOVEQ R0, #0
MOVHI R0, #1
MOVS PC, R14
.value
DCD &12345678
The optional attributes are:
DATA This is only meaningful if the symbol occurs in a code area,
and indicates that the symbol defines data rather than code.
FPREG This is only meaningful if the symbol defines a function
entry point and indicates that floating point arguments
should be passed in floating point registers.
LEAF The symbol defines a function call that makes no calls to any
any other functions.
STRONG This symbol should be used in preference to any other
non-strong symbol when resolving references in other files.
Any references to the symbol within the same file as the
strong definition resolve to the non-strong definition. This
allows a kind of link-time indirection.
Example:
EXPORT "leaf_fn" ["leaf"] ; export leaf_fn as a leaf symbol
INCLUDE "filename"
INCLUDE will load the specified BASIC program as a library (first
pass only) and call the definition of FNinclude in that program, if
one exists. The function call ignores definitions of FNinclude
elsewhere, such as other INCLUDE'd files.
END
SAVE "filename"
It is necessary to explicitly mark the end of an AOF file, to allow
the extra data to be inserted in the correct place, and to provide a
means of determining how many passes have been carried out. Either
END or SAVE "filename" can be used for this purpose; the latter will
automatically save the AOF output on the second pass. For example:
SAVE "o.program"
LDR register, =expression
LDF* register, =expression
LTORG
Literals are used to load immediate values that cannot be handled by
the MOV/MVN and MVF/MNF instructions. The expression is evaluated and
stored in the nearest following literal pool, and the instruction is
assembled to load from the value from this address. If the expression
contains an imported symbol, the necessary relocation directive will
be transparently added.
A literal pool is automatically added at the end of each area, but
extra literal pools can be created using the LTORG directive. This is
particularly useful when using floating point literals as the LDF
instruction only has a range of +-1020.
Examples:
LDR R0, =&4b534154 ; R0 = &4b534154 ("TASK")
LDFS F1, =3.1415926536 ; F1 = (float) 3.1415926536
LDR R7, =external@ + 4 ; R7 = address of external + 4
HEAD "function name"
This adds a function name header, used by backtraces and
disassemblers to name functions. For example:
EXPORT "compare"
HEAD "compare"
.compare
; ...
MOVS PC, R14
ENTRY
The ENTRY directive is used to tell the linker where program
execution should begin.
ORG address
ORG sets a base address for the current area. It should be used
carefully as it may cause problems for the linker, and only really
makes sense if the code needs to be mapped to a fixed hardware
location.
ROUT
ROUT "routine name"
ROUT marks the beginning of a new local label block. Local labels can
be defined multiple times in a single source file and are
particularly useful in macros. For example:
DEF FNexample
[ OPT pass%
ROUT
TST R0, #1
BNE @10
; ...
B @20
.10
; ...
.20
]
= 0
A local label always starts with a number and can optionally be
followed by the routine name, as supplied to the preceding ROUT. It
is an error to supply differing names; local labels outside the
current label block are hidden. References to a local label begin
with an @.
Notes
—————
* Registers are specified in the following form:
ARM registers: R0..R15
using APCS-R names: A1..A4 V1..V6 SL FP IP SP LR PC
Floating-point registers: F0..F7
General co-processor registers: C0..C15
To help to cope with any potential name clashes, the floating point and
APCS-R register names (except for PC) must be terminated with some
character not valid in a variable name in order to be recognised; they are
otherwise treated as part of a variable name.
* Coprocessor numbers (CP#) may be specified using either of the following
forms:
P0..P15
CP0..CP15
* Wherever a register or coprocessor number is specified, an expression may
be substituted in the usual manner allowed by BASIC V. This module employs
the routines used within BASIC to evaluate all expressions (eg. register
numbers, offsets and labels) and hence its interpretation of expressions is
guaranteed to be the same as BASIC.
Credits
———————
Adrian Lees (last known, AFAIK, at A.M.Lees-CSEE93@@cs.bham.ac.uk):
- for the original ExtBas and the EQU comma extension, and for the use of
some of his code
Michael Rozdoba (mroz@argonet.co.uk; remember TechForum?):
- for including the "General recursive method for Rb := Ra * C, C a
constant" from Appendix C of the manual for Acorn's desktop assembler,
and the late Acorn Computing (Sept 1994) for printing it;
- for the division code generator (Archimedes World, May 1995), which was
included, slightly trimmed, and debugged to handle top-bit-set unsigned
numbers properly... I hope!
Dominic Symes of !Zap fame (dominic.symes@armltd.co.uk):
- for pointing out that ANDEQ R0,R0,R0 could usefully be replaced by DCD 0
Martin Willers (m.willers@tu-bs.de):
- for bug hunting :-)
Reuben Thomas (rrt1001@cam.ac.uk):
- for pointing out it might be useful to disable the APCS register names,
suggesting B/W/D suffix for FILL (and %) and -ve immediate constants, and
bug encountering
Mohsen Alshayef (mohsen@qatar.net.qa):
- for some useful long MUL, STRH and [CS]PSR info
Michael Kircher (kircher@ph-cip.uni-koeln.de):
- for the integer square root code, and a few bug reports